In [1]:
import numpy as np
import pandas as pd

Read in your label data:

In [2]:
labels = pd.read_csv('labels.csv')
labels.head(10)

Unnamed: 0,rad1,rad2,rad3,biopsy
0,benign,benign,benign,benign
1,benign,benign,benign,benign
2,benign,benign,benign,benign
3,benign,benign,benign,benign
4,benign,benign,cancer,benign
5,cancer,cancer,cancer,cancer
6,benign,benign,benign,benign
7,benign,benign,benign,benign
8,cancer,cancer,benign,cancer
9,benign,benign,cancer,benign


## Create your first ground truth from the biopsy labels:

In [3]:
## I'm going to replace every label in my 'labels' dataframe with 0's and 1's (benign = 1, cancer = 0) for easier processing later:
labels2 = labels.replace('benign',1).replace('cancer',0)
labels2.head(10)

Unnamed: 0,rad1,rad2,rad3,biopsy
0,1,1,1,1
1,1,1,1,1
2,1,1,1,1
3,1,1,1,1
4,1,1,0,1
5,0,0,0,0
6,1,1,1,1
7,1,1,1,1
8,0,0,1,0
9,1,1,0,1
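
An explicit mapping is another way to do this encoding: it keeps the benign/cancer convention in one place and turns any unexpected label into a missing value that can be checked. The following is only a sketch on a made-up two-row frame; `demo`, `encoding` and `encoded` are hypothetical names, not part of the notebook's data.

```python
import pandas as pd

# Hypothetical two-row frame shaped like the notebook's `labels`
demo = pd.DataFrame({
    'rad1': ['benign', 'cancer'],
    'biopsy': ['benign', 'cancer'],
})

# Same convention as the notebook: benign -> 1, cancer -> 0
encoding = {'benign': 1, 'cancer': 0}

# Map each column through the dictionary; strings not in it become NaN
encoded = demo.apply(lambda col: col.map(encoding))

# Fail fast if any label was not covered by the mapping
assert not encoded.isna().any().any()
print(encoded)
```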


In [5]:
gt1 = labels2['biopsy']
gt1.head()

0    1
1    1
2    1
3    1
4    1
Name: biopsy, dtype: int64

## Create your second ground truth by majority vote of the three radiologists:

In [6]:
# Count the benign (1) votes per case; more than one out of three is a majority
gt2 = labels2[['rad1','rad2','rad3']].sum(axis=1)
gt2 = (gt2 > 1).replace(True,1).replace(False,0)
gt2.head()

0    1.0
1    1.0
2    1.0
3    1.0
4    1.0
dtype: float64
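
The result above comes back as floats. A minimal sketch of the same majority vote that yields integer labels is shown here on a small made-up frame; `demo`, `votes` and `majority` are hypothetical names used only for illustration.

```python
import pandas as pd

# Hypothetical toy data in the same 0/1 encoding (1 = benign, 0 = cancer)
demo = pd.DataFrame({
    'rad1': [1, 1, 0, 0],
    'rad2': [1, 1, 0, 1],
    'rad3': [1, 0, 1, 0],
})

# Count the benign votes per row; with three readers, two or more is a majority
votes = demo[['rad1', 'rad2', 'rad3']].sum(axis=1)

# Casting the boolean mask gives integer labels directly
majority = (votes >= 2).astype(int)
print(majority.tolist())
```

With an odd number of readers there are no ties, so the threshold can be written as either `> 1` or `>= 2`.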

## Create your third ground truth by weighting the three radiologists:

In [7]:
weighted_labels = labels2.copy()
# Down-weight rad1 and rad2; rad3 is left unscaled, so its weight stays at 1
weighted_labels['rad1'] = weighted_labels['rad1'] * 0.33
weighted_labels['rad2'] = weighted_labels['rad2'] * 0.67
weighted_labels.head()

Unnamed: 0,rad1,rad2,rad3,biopsy
0,0.33,0.67,1,1
1,0.33,0.67,1,1
2,0.33,0.67,1,1
3,0.33,0.67,1,1
4,0.33,0.67,0,1


In [8]:
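# Sum the weighted votes; the weights total 2, so "> 1" means more than half the weight.
# rad1 and rad2 together only reach 1.0, so a case is benign only if rad3 and at least one other reader say so.
gt3 = weighted_labels[['rad1','rad2','rad3']].sum(axis=1)
gt3 = (gt3 > 1).replace(True,1).replace(False,0)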
gt3.head()

0    1.0
1    1.0
2    1.0
3    1.0
4    0.0
dtype: float64
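
To see what these weights imply, it can help to score every possible vote pattern. The sketch below assumes the weights used above (0.33, 0.67, and an implicit 1.0 for rad3) and the same `> 1` threshold; `weights` and `combos` are hypothetical names introduced for illustration.

```python
import itertools
import pandas as pd

# Weights from the cells above; rad3 keeps an implicit weight of 1.0
weights = pd.Series({'rad1': 0.33, 'rad2': 0.67, 'rad3': 1.0})

# Enumerate all eight possible vote patterns (1 = benign, 0 = cancer)
combos = pd.DataFrame(list(itertools.product([0, 1], repeat=3)),
                      columns=['rad1', 'rad2', 'rad3'])

# Weighted score per pattern, then the same "> 1" threshold as the notebook
combos['score'] = combos[['rad1', 'rad2', 'rad3']].dot(weights)
combos['gt3'] = (combos['score'] > 1).astype(int)
print(combos)
```

Under these assumptions only the patterns in which rad3 votes benign together with at least one other reader pass the threshold, which is why row 4 (rad1 and rad2 benign, rad3 cancer) flips to 0 here while the simple vote keeps it at 1.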

## Compare the three ground truths:

Here, just explore the three sets of labels you created and see how often they agree. Each cell below lists the cases where a derived ground truth differs from the biopsy label.

In [9]:
biopsy_to_votes = gt1 == gt2
biopsy_to_votes[~biopsy_to_votes]

12    False
14    False
22    False
29    False
30    False
34    False
37    False
52    False
57    False
dtype: bool

In [10]:
biopsy_to_weights = gt1 == gt3
biopsy_to_weights[~biopsy_to_weights]

4     False
9     False
12    False
14    False
17    False
20    False
22    False
29    False
30    False
34    False
37    False
52    False
56    False
57    False
58    False
dtype: bool

Interestingly, in the example above the weighted scheme agrees with the biopsy labels less often than simple voting does: it differs from biopsy on 15 cases, against 9 for the majority vote. This may be an artefact of the weights we chose; weighting is not always worse than simple voting.
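
Listing the mismatched rows is one way to compare ground truths; counting them and turning the count into a rate gives a more compact summary. The following is a sketch on made-up labels, not the notebook's data; `reference`, `candidates` and `summary` are hypothetical names.

```python
import pandas as pd

# Hypothetical toy labels (1 = benign, 0 = cancer)
reference = pd.Series([1, 1, 0, 1, 0, 1])
candidates = pd.DataFrame({
    'votes':   [1, 1, 0, 0, 0, 1],
    'weights': [1, 0, 0, 0, 1, 1],
})

# Compare each candidate column to the reference, row by row
matches = candidates.eq(reference, axis=0)

# One row per candidate: number of mismatches and fraction of matching cases
summary = pd.DataFrame({
    'disagreements': (~matches).sum(),
    'agreement_rate': matches.mean(),
})
print(summary)
```

The same pattern could be applied to `gt2` and `gt3` against `gt1`, assuming all three share the same index.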